symmetric loss
\epsilon -Softmax: Approximating One-Hot Vectors for Mitigating Label Noise
Noisy labels pose a common challenge for training accurate deep neural networks. To mitigate label noise, prior studies have proposed various robust loss functions to achieve noise tolerance in the presence of label noise, particularly symmetric losses. However, they usually suffer from the underfitting issue due to the overly strict symmetric condition. In this work, we propose a simple yet effective approach for relaxing the symmetric condition, namely **$\epsilon$-softmax**, which simply modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error $\epsilon$. Essentially, ***$\epsilon$-softmax** not only acts as an alternative for the softmax layer, but also implicitly plays the crucial role in modifying the loss function.* We prove theoretically that **$\epsilon$-softmax** can achieve noise-tolerant learning with controllable excess risk bound for almost any loss function. Recognizing that **$\epsilon$-softmax**-enhanced losses may slightly reduce fitting ability on clean datasets, we further incorporate them with one symmetric loss, thereby achieving a better trade-off between robustness and effective learning. Extensive experiments demonstrate the superiority of our method in mitigating synthetic and real-world label noise.
On Symmetric Losses for Robust Policy Optimization with Noisy Preferences
Nishimori, Soichiro, Zhang, Yu-Jie, Lodkaew, Thanawat, Sugiyama, Masashi
Optimizing policies based on human preferences is key to aligning language models with human intent. This work focuses on reward modeling, a core component in reinforcement learning from human feedback (RLHF), and offline preference optimization, such as direct preference optimization. Conventional approaches typically assume accurate annotations. However, real-world preference data often contains noise due to human errors or biases. We propose a principled framework for robust policy optimization under noisy preferences, viewing reward modeling as a classification problem. This allows us to leverage symmetric losses, known for their robustness to label noise in classification, leading to our Symmetric Preference Optimization (SymPO) method. We prove that symmetric losses enable successful policy optimization even under noisy labels, as the resulting reward remains rank-preserving -- a property sufficient for policy improvement. Experiments on synthetic and real-world tasks demonstrate the effectiveness of SymPO.
\epsilon -Softmax: Approximating One-Hot Vectors for Mitigating Label Noise
Noisy labels pose a common challenge for training accurate deep neural networks. To mitigate label noise, prior studies have proposed various robust loss functions to achieve noise tolerance in the presence of label noise, particularly symmetric losses. However, they usually suffer from the underfitting issue due to the overly strict symmetric condition. In this work, we propose a simple yet effective approach for relaxing the symmetric condition, namely ** \epsilon -softmax**, which simply modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error \epsilon . Essentially, *** \epsilon -softmax** not only acts as an alternative for the softmax layer, but also implicitly plays the crucial role in modifying the loss function.*
Density-Based Algorithms for Corruption-Robust Contextual Search and Convex Optimization
Leme, Renato Paes, Podimata, Chara, Schneider, Jon
We study the problem of contextual search, a generalization of binary search in higher dimensions, in the adversarial noise model. Let $d$ be the dimension of the problem, $T$ be the time horizon and $C$ be the total amount of adversarial noise in the system. We focus on the $\epsilon$-ball and the absolute loss. For the $\epsilon$-ball loss, we give a tight regret bound of $O(C + d \log(1/\epsilon))$ improving over the $O(d^3 \log(1/\epsilon) \log^2(T) + C \log(T) \log(1/\epsilon))$ bound of Krishnamurthy et al (Operations Research '23). For the absolute loss, we give an efficient algorithm with regret $O(C+d \log T)$. To tackle the absolute loss case, we study the more general setting of Corruption-Robust Convex Optimization with Subgradient feedback, which is of independent interest. Our techniques are a significant departure from prior approaches. Specifically, we keep track of density functions over the candidate target vectors instead of a knowledge set consisting of the candidate target vectors consistent with the feedback obtained.
A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease
Al, Walid Abdullah, Yun, Il Dong, Bae, Yun Jung
Dopamine transporter (DAT) imaging is commonly used for monitoring Parkinson's disease (PD), where striatal DAT uptake amount is computed to assess PD severity. However, DAT imaging has a high cost and the risk of radiance exposure and is not available in general clinics. Recently, MRI patch of the nigral region has been proposed as a safer and easier alternative. This paper proposes a symmetric regressor for predicting the DAT uptake amount from the nigral MRI patch. Acknowledging the symmetry between the right and left nigrae, the proposed regressor incorporates a paired input-output model that simultaneously predicts the DAT uptake amounts for both the right and left striata. Moreover, it employs a symmetric loss that imposes a constraint on the difference between right-to-left predictions, resembling the high correlation in DAT uptake amounts in the two lateral sides. Additionally, we propose a symmetric Monte-Carlo (MC) dropout method for providing a fruitful uncertainty estimate of the DAT uptake prediction, which utilizes the above symmetry. We evaluated the proposed approach on 734 nigral patches, which demonstrated significantly improved performance of the symmetric regressor compared with the standard regressors while giving better explainability and feature representation. The symmetric MC dropout also gave precise uncertainty ranges with a high probability of including the true DAT uptake amounts within the range.
Performances of Symmetric Loss for Private Data from Exponential Mechanism
Bi, Jing, Suppakitpaisarn, Vorapong
This study explores the robustness of learning by symmetric loss on private data. Specifically, we leverage exponential mechanism (EM) on private labels. First, we theoretically re-discussed properties of EM when it is used for private learning with symmetric loss. Then, we propose numerical guidance of privacy budgets corresponding to different data scales and utility guarantees. Further, we conducted experiments on the CIFAR-10 dataset to present the traits of symmetric loss. Since EM is a more generic differential privacy (DP) technique, it being robust has the potential for it to be generalized, and to make other DP techniques more robust.
CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images
Levy, Axel, Poitevin, Frédéric, Martel, Julien, Nashed, Youssef, Peck, Ariana, Miolane, Nina, Ratner, Daniel, Dunne, Mike, Wetzstein, Gordon
Cryo-electron microscopy (cryo-EM) has become a tool of fundamental importance in structural biology, helping us understand the basic building blocks of life. The algorithmic challenge of cryo-EM is to jointly estimate the unknown 3D poses and the 3D electron scattering potential of a biomolecule from millions of extremely noisy 2D images. Existing reconstruction algorithms, however, cannot easily keep pace with the rapidly growing size of cryo-EM datasets due to their high computational and memory cost. We introduce cryoAI, an ab initio reconstruction algorithm for homogeneous conformations that uses direct gradient-based optimization of particle poses and the electron scattering potential from single-particle cryo-EM data. CryoAI combines a learned encoder that predicts the poses of each particle image with a physics-based decoder to aggregate each particle image into an implicit representation of the scattering potential volume. This volume is stored in the Fourier domain for computational efficiency and leverages a modern coordinate network architecture for memory efficiency. Combined with a symmetrized loss function, this framework achieves results of a quality on par with state-of-the-art cryo-EM solvers for both simulated and experimental data, one order of magnitude faster for large datasets and with significantly lower memory requirements than existing methods.
Memorization in Deep Neural Networks: Does the Loss Function matter?
Deep Neural Networks, often owing to the overparameterization, are shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of the loss function can affect this memorization. We empirically show, with benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross-entropy or squared error loss, results in significant improvement in the ability of the network to resist such overfitting. We then provide a formal definition for robustness to memorization and provide a theoretical explanation as to why the symmetric losses provide this robustness. Our results clearly bring out the role loss functions alone can play in this phenomenon of memorization.